Efficient and Accurate Label Propagation on Large Graphs and Label Sets

Authors

  • Michele Covell
  • Shumeet Baluja
Abstract

Many web-based application areas must infer label distributions starting from a small set of sparse, noisy labels. Examples include searching for, recommending, and advertising against image, audio, and video content. These labeling problems must handle millions of interconnected entities (users, domains, content segments) and thousands of competing labels (interests, tags, recommendations, topics). Previous work has shown that graph-based propagation can be very effective at finding the best label distribution across nodes, starting from partial information and a weighted connection graph. In their work on video recommendations, Baluja et al. [1] showed high-quality results using Adsorption, a normalized propagation process. An important step in the original formulation of Adsorption was re-normalization of the label vectors associated with each node after every propagation step. That interleaved normalization forced all label distributions to be computed in synchrony so that the normalization could be determined correctly. It also prevented the use of standard linear-algebra methods, such as stabilized bi-conjugate gradient descent (BiCGStab) and Gaussian elimination. This paper presents a method that replaces the interleaved normalization with a single pre-normalization, done once before the main propagation process starts, allowing the use of selective label computation (label slicing) as well as large-matrix solution methods. As a result, much larger graphs and label sets can be handled than in the original formulation, and more accurate solutions can be found in fewer propagation steps. We also report results from using pre-normalized Adsorption in topic labeling for web domains, using label slicing and BiCGStab.

Keywords: graph propagation, large-scale labeling, stabilized bi-conjugate gradient descent, Gaussian elimination, topic discovery, web domains.
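The abstract's central point, that removing per-step re-normalization makes propagation linear and therefore separable by label, can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract gives no equations, so the row normalization in `pre_normalize`, the damping factor `ALPHA`, the update rule in `propagate`, and the toy graph are all assumptions chosen only to show the idea of label slicing.

```python
# Toy illustration (assumed setup, not the paper's equations): once the graph
# is normalized a single time up front, each propagation step is linear, so
# one label can be propagated alone and still match the full computation.

# Hypothetical symmetric edge weights between 4 nodes.
WEIGHTS = [
    [0.0, 2.0, 1.0, 0.0],
    [2.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 3.0],
    [0.0, 1.0, 3.0, 0.0],
]

# Hypothetical seed labels: node 0 carries label 0, node 3 carries label 1.
SEEDS = [
    [1.0, 0.0],
    [0.0, 0.0],
    [0.0, 0.0],
    [0.0, 1.0],
]

ALPHA = 0.8  # assumed weight on neighbor information versus the seed labels


def pre_normalize(weights):
    """Scale each row to sum to 1, once, before propagation starts."""
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total if total else 0.0 for w in row])
    return normalized


def propagate(norm_weights, seeds, alpha=ALPHA, steps=200):
    """Iterate Y <- alpha * W * Y + (1 - alpha) * seeds.

    There is no re-normalization inside the loop, so every label column
    evolves independently of the others.
    """
    n = len(seeds)
    k = len(seeds[0])
    labels = [row[:] for row in seeds]
    for _ in range(steps):
        labels = [
            [
                alpha * sum(norm_weights[i][j] * labels[j][c] for j in range(n))
                + (1 - alpha) * seeds[i][c]
                for c in range(k)
            ]
            for i in range(n)
        ]
    return labels


def slice_label(seeds, c):
    """Keep only label column c of the seed matrix (a label slice)."""
    return [[row[c]] for row in seeds]


W = pre_normalize(WEIGHTS)
full = propagate(W, SEEDS)                         # all labels together
only_label_1 = propagate(W, slice_label(SEEDS, 1))  # label 1 on its own

for i in range(len(SEEDS)):
    print(i, full[i][1], only_label_1[i][0])
```

In this sketch the sliced run reproduces the matching column of the full run, which is what would fail if label vectors were re-normalized across labels at every step. The converged values also satisfy a fixed linear system, (I - ALPHA * W) y = (1 - ALPHA) * s for each label column, which is the kind of system a direct or Krylov solver could be applied to instead of iterating.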

Similar resources

Efficient and Accurate Label Propagation on Dynamic Graphs and Label Sets

Many web-based application areas must infer label distributions starting from a small set of sparse, noisy labels. Previous work has shown that graph-based propagation can be very effective at finding the best label distribution across nodes, starting from partial information and a weighted connection graph. In their work on video recommendations, Baluja et al. showed high-quality results using ...

Community Detection using a New Node Scoring and Synchronous Label Updating of Boundary Nodes in Social Networks

Community structure is vital for discovering the important structures and potential properties of complex networks. In recent years, improving the quality of local community detection approaches has become a hot spot in the study of complex networks, owing to their linear time complexity and applicability to large-scale networks. However, there are many shortcomings in these methods such as in...

Exact Inference for Multi-label Classification using Sparse Graphical Models

This paper describes a parameter estimation method for multi-label classification that does not rely on approximate inference. It is known that multi-label classification involving label correlation features is intractable, because the graphical model for this problem is a complete graph. Our solution is to exploit the sparsity of features, and express a model structure for each object by using...

Scalable Label Propagation for Multi-relational Learning on Tensor Product Graph

Label propagation on the tensor product of multiple graphs can infer multi-relations among the entities across the graphs by learning labels in a tensor. However, the tensor formulation is only empirically scalable up to three graphs due to the exponential complexity of computing tensors. In this paper, we propose an optimization formulation and a scalable Low-rank Tensor-based Label Propagation...

A simulation study on the performance of various label-free electronic biosensors

The efficient detection of charged biomolecules by biosensor with appropriate semiconducting nanomaterials and with optimum device geometry has caught tremendous research interest in the present decade. Here, the performance of various label-free electronic biosensors to detect bio-molecules is investigated by simulation technique. Silicon nanowire sensor, nanosphere sensor and double gate fiel...

Publication date: 2013